Causal Inference Q-Network: Toward Resilient Reinforcement Learning
Chao-Han Huck Yang, I-Te Danny Hung, Yi Ouyang, Pin-Yu Chen
However, most successful demonstrations of these DRL methods are trained and deployed under well-controlled conditions. In contrast, real-world use cases often encounter unavoidable observational uncertainty [Grigorescu et al., 2020, Hafner et al., 2018, Moreno et al., 2018] arising from an external attacker [Huang et al., 2017] or a noisy sensor [Fortunato et al., 2018, Lee et al., 2018]. For example, an online video game may suffer sudden blackouts or frame skipping due to network instability, and a driver on the road may be temporarily blinded when facing the sun. Such abrupt interference with the observation can cause serious problems for DRL algorithms. Unlike other machine learning tasks that involve only a single mission at a time (e.g., image classification), an RL agent must handle a dynamic [Schmidhuber, 1992] and encoded state [Schmidhuber, 1991, Kaelbling et al., 1998] while also anticipating future rewards. Therefore, DRL-based systems are likely to propagate and even amplify the risks (e.g., delays and noisy pulsed signals in sensor fusion [Yurtsever et al., 2020, Johansen et al., 2015]) induced by the uncertain interference.